Neweyes: A System for Comparing Biological Sequences Using the Running Karp-Rabin Greedy String-Tiling Algorithm

نویسنده

  • Michael J. Wise
چکیده

A system for aligning nucleotide or amino acid biosequences is described. The system, called Neweyes, employs a novel string matching algorithm. Running Karp-Rabin Greedy String Tiling (RKR-GST), which involves tiling one string with matching substrings of a second string. In practice, RKR-GST has a computational complexity that appears close to linear. With RKR-GST, Neweyes is able to detect transposed substrings or substrings of one biosequence that appears rearranged in a second sequence. Repeated substrings can also be detected. Neweyes also supports a form of matching-by-group that gives the effect of different amino acid mutation matrices. Neweyes can be used in a macro mode (searching a database for a list of biosequences that are similar to a given biosequence) or in a micro mode, where two biosequences are compared and more detailed output formats are available.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Neweyes: A System for Comparing Biological Sequences Using the Running Karp-Rabin Greedy String-Tiling Algorithm1

A system for aligning nucleotide or amino acid biosequences is described. The system, called Neweyes, employs a novel string matching algorithm, Running KarpRabin Greedy String Tiling (RKR-GST), which involves tiling one string with matching substrings of a second string. In practice, RKR-GST has a computational complexity that appears close to linear. With RKR-GST, Neweyes is able to detect tr...

متن کامل

Running Karp-Rabin Matching and Greedy String Tiling

A system for aligning nucleotide or amino acid biosequences is described. The system, called Neweyes, employs a novel string matching algorithm, Running Karp-Rabin Greedy String Tiling (RKR-GST), which involves tiling one string with matching substrings of a second string. In practice, RKR-GST has a computational complexity that appears close to linear. With RKR-GST, Neweyes is able to detect t...

متن کامل

University of Sheffield - Lab Report for PAN at CLEF 2010

This paper describes the University of Sheffield entry for the 2nd international plagiarism detection competition (PAN 2010). Our system attempts to identify extrinsic plagiarism. A three-stage approach is used: pre-processing, candidate document selection (using word n-grams) and detailed analysis (using the Running Karp-Rabin Greedy String Tiling string matching algorithm). This approach achi...

متن کامل

External Plagiarism Detection using Information Retrieval and Sequence Alignment - Notebook for PAN at CLEF 2011

This paper describes the University of Sheffield entry for the 3rd International Competition on Plagiarism Detection which attempted the monolingual external plagiarism detection task. A three stage framework was used: preprocessing and indexing, candidate document selection (using an Information Retrieval based approach) and detailed analysis (using the Running Karp-Rabin Greedy String Tiling ...

متن کامل

Implementation of Rabin Karp String Matching Algorithm Using MPI

Searching for occurrences of string patterns is a common problem in many applications. One of the easiest approaches is to search a pattern in a text character by character. But this method becomes very slow when we deal with long patterns. Various good solutions like Naïve string Matcher, Knuth-Morris-Pratt (KMP), Rabin Karp (RK), etc have already been presented for string matching and we sele...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

دوره 3  شماره 

صفحات  -

تاریخ انتشار 1995